Human behavior emerges from planning over elaborate decompositions of tasks into goals, subgoals, and low-level actions. How are these decompositions created and used? Here, we propose and evaluate a normative framework for task decomposition based on the simple idea that people decompose tasks to reduce the overall cost of planning while maintaining task performance. Analyzing 11,117 distinct graph-structured planning tasks, we find that our framework justifies several existing heuristics for task decomposition and makes predictions that can be distinguished from two alternative normative accounts. We report a behavioral study of task decomposition ($N=806$) that uses 30 randomly sampled graphs, a larger and more diverse set than that of any previous behavioral study on this topic. We find that human responses are more consistent with our framework for task decomposition than alternative normative accounts and are most consistent with a heuristic -- betweenness centrality -- that is justified by our approach. Taken together, our results provide new theoretical insight into the computational principles underlying the intelligent structuring of goal-directed behavior.
translated by 谷歌翻译
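As an illustration of the betweenness centrality heuristic named in the abstract above, the following minimal sketch computes it for a small unweighted, undirected graph using Brandes' algorithm. The example graph, the function name, and the reading that high-centrality nodes are candidate subgoals are assumptions made for illustration; they are not taken from the paper.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalized betweenness for an undirected, unweighted graph.

    `graph` maps each node to a list of its neighbors (both directions listed).
    """
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        visit_order = []
        predecessors = {v: [] for v in graph}
        num_paths = {v: 0 for v in graph}
        distance = {v: -1 for v in graph}
        num_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in graph[v]:
                if distance[w] < 0:  # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v lies on a shortest path to w
                    num_paths[w] += num_paths[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        dependency = {v: 0.0 for v in graph}
        while visit_order:
            w = visit_order.pop()
            for v in predecessors[w]:
                dependency[v] += num_paths[v] / num_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2 for v, c in centrality.items()}


# Hypothetical task graph: two triangles joined by the single edge C-D.
example_graph = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
    "D": ["C", "E", "F"], "E": ["D", "F"], "F": ["D", "E"],
}
scores = betweenness_centrality(example_graph)
```

In this example only the two bridge nodes, C and D, lie on shortest paths between other nodes, so they are the nodes such a heuristic would single out.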
We present an extension to masked autoencoders (MAE) that improves the representations learnt by the model by explicitly encouraging the learning of higher scene-level features. We do this by (i) introducing a perceptual similarity term between generated and real images and (ii) incorporating several techniques from the adversarial training literature, including multi-scale training and adaptive discriminator augmentation. The combination of these results in not only better pixel reconstruction but also representations that appear to better capture higher-level details within images. More consequentially, we show how our method, Perceptual MAE, leads to better performance when used for downstream tasks, outperforming previous methods. We achieve 78.1% top-1 accuracy with linear probing on ImageNet-1K and up to 88.1% when fine-tuning, with similar results for other downstream tasks, all without the use of additional pre-trained models or data.
Artificial intelligence methods including deep neural networks (DNN) can provide rapid molecular classification of tumors from routine histology with accuracy that matches or exceeds human pathologists. Discerning how neural networks make their predictions remains a significant challenge, but explainability tools help provide insights into what models have learned when corresponding histologic features are poorly defined. Here, we present a method for improving explainability of DNN models using synthetic histology generated by a conditional generative adversarial network (cGAN). We show that cGANs generate high-quality synthetic histology images that can be leveraged for explaining DNN models trained to classify molecularly-subtyped tumors, exposing histologic features associated with molecular state. Fine-tuning synthetic histology through class and layer blending illustrates nuanced morphologic differences between tumor subtypes. Finally, we demonstrate the use of synthetic histology for augmenting pathologist-in-training education, showing that these intuitive visualizations can reinforce and improve understanding of histologic manifestations of tumor biology.
The selection of an optimal pacing site, which is ideally scar-free and late activated, is critical to the response to cardiac resynchronization therapy (CRT). Despite the success of current approaches that formulate the detection of such late mechanical activation (LMA) regions as a problem of activation time regression, their accuracy remains unsatisfactory, particularly in cases where myocardial scar exists. To address this issue, this paper introduces a multi-task deep learning framework that simultaneously estimates the LMA amount and classifies the scar-free LMA regions based on cine displacement encoding with stimulated echoes (DENSE) magnetic resonance imaging (MRI). With a newly introduced auxiliary LMA region classification sub-network, our proposed model is more robust to the complex patterns caused by myocardial scar, significantly reduces their negative effects on LMA detection, and in turn improves the performance of scar classification. To evaluate the effectiveness of our method, we test our model on real cardiac MR images and compare the predicted LMA with that of state-of-the-art approaches. The results show that our approach achieves substantially increased accuracy. In addition, we employ gradient-weighted class activation mapping (Grad-CAM) to visualize the feature maps learned by all methods. Experimental results suggest that our proposed model better recognizes the LMA region pattern.
Prior work on ideology prediction has largely focused on single modalities, i.e., text or images. In this work, we introduce the task of multimodal ideology prediction, where a model predicts binary or five-point scale ideological leanings, given a text-image pair with political content. We first collect five new large-scale datasets with English documents and images along with their ideological leanings, covering news articles from a wide range of US mainstream media and social media posts from Reddit and Twitter. We conduct in-depth analyses of news articles and reveal differences in image content and usage across the political spectrum. Furthermore, we perform extensive experiments and ablation studies, demonstrating the effectiveness of targeted pretraining objectives on different model components. Our best-performing model, a late-fusion architecture pretrained with a triplet objective over multimodal content, outperforms the state-of-the-art text-only model by almost 4% and a strong multimodal baseline with no pretraining by over 3%.
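To make the "triplet objective" mentioned in the abstract above concrete, here is a minimal sketch of a standard triplet margin loss on embedding vectors. The abstract does not say how triplets are formed or which distance is used; the Euclidean distance, the margin value, and the idea that the anchor and positive share an ideology label are assumptions for illustration only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (assumed form, not the paper's exact objective).

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical embeddings: anchor and positive share a label, negative does not.
anchor, positive, negative = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]
well_separated = triplet_margin_loss(anchor, positive, negative)  # negative is far away
violated = triplet_margin_loss(anchor, negative, positive)        # roles swapped
```

With the roles swapped, the "positive" is farther from the anchor than the "negative", so the loss becomes positive and would push the embeddings apart during training.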
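Human-object interaction (HOI) recognition in videos is important for analyzing human activity. Most existing work focuses on visual features and therefore usually suffers from occlusion in real-world scenarios. The problem becomes more complicated when multiple people and objects are involved in an HOI. Considering that geometric features such as human pose and object position provide meaningful information for understanding HOIs, we argue for combining the benefits of visual and geometric features in HOI recognition, and propose a novel Two-level Geometric feature-informed Graph Convolutional Network (2G-GCN). The geometric-level graph models the interdependency between the geometric features of humans and objects, while the fusion-level graph fuses them with the visual features of humans and objects. To demonstrate the novelty and effectiveness of our method in challenging scenarios, we propose a new multi-person HOI dataset (MPHOI-72). Extensive experiments on the MPHOI-72 (multi-person HOI), CAD-120 (single-human HOI), and Bimanual Actions (two-hand HOI) datasets demonstrate our superior performance compared to the state of the art.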
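This paper examines the use of risk models to predict the timing and location of wildfires caused by electricity infrastructure. Our data include historical ignition and wire-down points triggered by grid infrastructure, collected between 2015 and 2019 in Pacific Gas and Electric territory, along with various weather and vegetation data and high-resolution data on grid infrastructure, including location, age, and materials. With these data we explore a range of machine learning methods and strategies for managing training data imbalance. The best area under the receiver operating characteristic curve we obtain is 0.776 for distribution feeder ignitions and 0.824 for transmission line wire-down events, both using the histogram-based gradient boosting tree algorithm (HGB) with downsampling. We then use these models to identify which information provides the most predictive value. After line length, we find that weather and vegetation features dominate the list of most important features for ignition or wire-down risk. Distribution ignition models show more dependence on slowly varying vegetation variables such as burn index, energy release content, and tree height, whereas transmission wire-down models rely more on primary weather variables such as wind speed and precipitation. These results point to the importance of improved vegetation modeling for feeder ignition risk models and of improved weather forecasting for transmission wire-down models. We observe that infrastructure features make small but meaningful improvements to the predictive power of the risk models.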
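The two evaluation ingredients named in the abstract above, downsampling the majority class and scoring with the area under the ROC curve, can be sketched in plain Python as follows. The 1:1 class ratio, the function names, and the toy scores are assumptions for illustration; the abstract does not state the sampling ratio, and the gradient boosting model itself is omitted.

```python
import random


def downsample_majority(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size.

    Assumes binary labels (0 or 1); the 1:1 target ratio is an assumption.
    """
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    if len(positives) <= len(negatives):
        minority, majority = positives, negatives
    else:
        minority, majority = negatives, positives
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the sample size, so this
    is only meant for small illustrative inputs.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return correct / (len(positives) * len(negatives))


# Toy example: 3 ignition events among 20 line segments (hypothetical data).
toy_rows = list(range(20))
toy_labels = [1, 1, 1] + [0] * 17
balanced_rows, balanced_labels = downsample_majority(toy_rows, toy_labels)
toy_auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Downsampling would normally be applied to the training split only, so that the reported AUC still reflects the true class balance of held-out data.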
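Data imbalance, in which a plurality of the data samples come from a small proportion of labels, poses a challenge in training deep neural networks. Unlike classification, in regression the labels are continuous, potentially boundless, and form a natural ordering. These distinct features of regression call for new techniques that leverage the additional information encoded in label-space relationships. This paper presents the RankSim (ranking similarity) regularizer for deep imbalanced regression, which encodes an inductive bias that samples that are closer in label space should also be closer in feature space. In contrast to recent distribution smoothing based approaches, RankSim captures both nearby and distant relationships: for a given data sample, RankSim encourages the sorted list of its neighbors in label space to match the sorted list of its neighbors in feature space. RankSim is complementary to conventional imbalanced learning techniques, including re-weighting, two-stage training, and distribution smoothing, and lifts the state-of-the-art performance on three imbalanced regression benchmarks: IMDB-WIKI-DIR, AgeDB-DIR, and STS-B-DIR.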
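The ranking idea described in the abstract above can be illustrated with a small, non-differentiable diagnostic: for one anchor sample, rank the other samples by closeness in label space and by closeness in feature space, then measure how much the two rankings disagree. This is a sketch under stated assumptions, not the RankSim regularizer itself: the negative squared Euclidean distance as feature similarity, the mean squared rank difference, and all names are hypothetical, and a trainable version would need a differentiable surrogate for the ranking step.

```python
def ranks(similarities):
    """Rank of each entry, where rank 0 is the most similar; ties keep input order."""
    order = sorted(range(len(similarities)), key=lambda i: -similarities[i])
    result = [0] * len(similarities)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def rank_mismatch(labels, features, anchor):
    """Mean squared difference between label-space and feature-space neighbor ranks."""
    others = [i for i in range(len(labels)) if i != anchor]
    label_similarity = [-abs(labels[i] - labels[anchor]) for i in others]
    feature_similarity = [
        -sum((a - b) ** 2 for a, b in zip(features[i], features[anchor]))
        for i in others
    ]
    label_ranks = ranks(label_similarity)
    feature_ranks = ranks(feature_similarity)
    return sum((a - b) ** 2 for a, b in zip(label_ranks, feature_ranks)) / len(others)


# Hypothetical ages as labels, with one-dimensional features.
toy_labels = [10, 12, 30, 50]
aligned_features = [[0.0], [0.1], [1.0], [2.0]]     # feature order follows label order
misaligned_features = [[0.0], [2.0], [1.0], [0.1]]  # nearest label is farthest in features
aligned_penalty = rank_mismatch(toy_labels, aligned_features, anchor=0)
misaligned_penalty = rank_mismatch(toy_labels, misaligned_features, anchor=0)
```

The penalty is zero when the two neighbor orderings agree and grows as they diverge, which is the property the abstract attributes to the regularizer.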
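The ability to identify influential training examples enables us to debug training data and explain model behavior. Existing techniques for doing so are based on the flow of training data influence through the model parameters. For large models in NLP applications, it is often computationally infeasible to study this flow through all model parameters, so techniques usually pick the last layer of weights. However, we observe that since the activation connected to the last layer of weights contains "shared logic", the data influence calculated via the last-layer weights is prone to a "cancellation effect", where the data influences of different examples have large magnitudes that contradict each other. The cancellation effect lowers the discriminative power of the influence score, and deleting influential examples according to this measure often does not change the model's behavior by much. To mitigate this, we propose a technique called TracIn-WE that modifies a method called TracIn to operate on the word embedding layer instead of the last layer, where the cancellation effect is less severe. One potential concern is that influence based on the word embedding layer may not encode sufficient high-level information. However, we find that gradients (unlike embeddings) do not suffer from this, possibly because they chain through the higher layers. We show that TracIn-WE significantly outperforms other data influence methods applied on the last layer by 4-10 times on the case deletion evaluation on three language classification tasks. In addition, TracIn-WE can produce scores not only at the level of the overall training input but also at the level of words within the training input, a further aid in debugging.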
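Modeling spatial relationships is essential for recognizing human actions, especially when a human is interacting with objects and multiple objects appear around the human differently over time. Most existing action recognition models focus on learning the overall visual cues of a scene but disregard the informative fine-grained view of its contents, which can be captured by learning human-object relationships and interactions. In this paper, we learn human-object relationships by exploiting the interaction of their local and global contexts. We therefore propose the Global-Local Interaction Distillation Network (GLIDN), which learns human and object interactions through space and time via knowledge distillation for fine-grained scene understanding. GLIDN encodes humans and objects as graph nodes and learns local and global relations via a graph attention network. The local context graphs learn the relationships between humans and objects at the frame level by capturing their co-occurrence at a specific time step. The global relation graph is constructed at the video level from human and object interactions, identifying their long-range relations throughout a video sequence. More importantly, we investigate how the knowledge from these graphs can be distilled to their counterparts to improve human-object interaction (HOI) recognition. We evaluate our model through comprehensive experiments on two datasets, Charades and CAD-120. We achieve better results than the baselines and counterpart approaches.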